Dimensionality Reduction Techniques for Proximity Problems

Piotr Indyk, Stanford University

Abstract

In this paper we give approximation algorithms for several proximity problems in high-dimensional spaces. In particular, we give the first Las Vegas data structure for $(1+\epsilon)$-nearest neighbor with polynomial space and query time polynomial in the dimension $d$ and $\log n$, where $n$ is the database size. We also give a deterministic 3-approximation algorithm with similar bounds; this is the first deterministic constant-factor approximation algorithm (with polynomial space) for any norm. For the closest pair problem we give a roughly $n^{1+\rho}$ time Las Vegas algorithm with approximation factor $O(1/\rho \log 1/\rho)$; this is the first Las Vegas algorithm for this problem. Finally, we show a general reduction from the furthest point problem to the nearest neighbor problem. As a corollary, we improve the running time for the $(1+\epsilon)$-approximate diameter problem from $n^{2-O(\epsilon^2)}$ to $n^{2-O(\epsilon)}$. Our results are unified by the fact that their key component is a dimensionality reduction technique for Hamming spaces.
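
To make the idea of reducing dimension in a Hamming space concrete, here is a minimal Python sketch of one generic approach: keep only a random sample of coordinates and rescale the distance measured on that sample. This is an illustration under stated assumptions and is not claimed to be the construction used in the paper; the function names (`hamming`, `sample_coordinates`, `project`, `estimate_distance`) and the `seed` parameter are hypothetical.

```python
import random


def hamming(x, y):
    """Number of coordinates where two equal-length sequences differ."""
    return sum(a != b for a, b in zip(x, y))


def sample_coordinates(d, k, seed=0):
    """Pick k coordinate indices in range(d) uniformly at random, with replacement."""
    rng = random.Random(seed)
    return [rng.randrange(d) for _ in range(k)]


def project(x, coords):
    """Restrict a d-dimensional vector to the sampled coordinates."""
    return [x[i] for i in coords]


def estimate_distance(x, y, coords):
    """Rescale the Hamming distance of the projections to estimate the original one.

    Each sampled coordinate differs with probability hamming(x, y) / d,
    so the rescaled count is an unbiased estimate of hamming(x, y).
    """
    d, k = len(x), len(coords)
    return hamming(project(x, coords), project(y, coords)) * d / k
```

The estimate is only accurate in expectation; how many coordinates are needed for a given accuracy, and how such a step fits into a nearest neighbor data structure, is exactly the kind of question the paper addresses.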


Related papers

Compressive Image Feature Extraction by Means of Folding

We explore the utility of a dimensionality-reducing process we term folding for the purposes of image feature extraction. We seek to discover whether image features are preserved under this process and how to efficiently extract them. The application is in size-, weight-, and power-constrained imaging scenarios where an efficient implementation of this dimensionality reduction can save power and co...


Scalable Techniques for Clustering the Web

Clustering is one of the most crucial techniques for dealing with the massive amount of information present on the web. Clustering can either be performed once offline, independent of search queries, or performed online on the results of search queries. Our offline approach aims to efficiently cluster similar pages on the web, using the technique of Locality-Sensitive Hashing (LSH), in which we...


Near-Optimal (Euclidean) Metric Compression

The metric sketching problem is defined as follows. Given a metric on $n$ points, and $\epsilon > 0$, we wish to produce a small size data structure (sketch) that, given any pair of point indices, recovers the distance between the points up to a $1+\epsilon$ distortion. In this paper we consider metrics induced by $\ell_2$ and $\ell_1$ norms whose spread (the ratio of the diameter to the closest pair distance) is bounded by Φ >...
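
The spread defined in this abstract (diameter divided by closest pair distance) can be computed directly for a small point set. The following Python sketch does so by brute force over all pairs under the Euclidean norm; the function name `spread` is hypothetical, and the quadratic-time loop is only meant to illustrate the definition, not any method from the paper.

```python
import math
from itertools import combinations


def spread(points):
    """Ratio of the largest to the smallest pairwise Euclidean distance.

    Assumes at least two points and that all points are distinct,
    so the closest pair distance is nonzero.
    """
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    if not dists or min(dists) == 0:
        raise ValueError("need at least two distinct points")
    return max(dists) / min(dists)
```

For example, the collinear points `(0, 0)`, `(1, 0)`, `(4, 0)` have pairwise distances 1, 3, and 4, so their spread is 4.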


Dimension Reduction in Kernel Spaces from Locality-Sensitive Hashing

We provide novel methods for efficient dimensionality reduction in kernel spaces. That is, we provide efficient and explicit randomized maps from “data spaces” into “kernel spaces” of low dimension, which approximately preserve the original kernel values. The constructions are based on observing that such maps can be obtained from Locality-Sensitive Hash (LSH) functions, a primitive developed f...



Publication date: 2000